Overview

Dataset statistics

Number of variables: 10
Number of observations: 44212225
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.3 GiB
Average record size in memory: 80.0 B
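The memory figures can be cross-checked with simple arithmetic. The sketch below assumes every column stores one 8-byte value per row (64-bit numbers, or 64-bit pointers for the string columns); that layout is an assumption, not something the report states, and the variable names are illustrative only.

```python
# Figures copied from the dataset statistics; the 8-byte layout is an assumption.
N_ROWS = 44_212_225
N_COLS = 10
BYTES_PER_VALUE = 8  # assumed: one 64-bit slot per value

record_bytes = N_COLS * BYTES_PER_VALUE          # bytes per row
total_gib = N_ROWS * record_bytes / 2**30        # whole frame, in GiB
column_mib = N_ROWS * BYTES_PER_VALUE / 2**20    # one column, in MiB

print(f"record size: {record_bytes} B")
print(f"total size:  {total_gib:.1f} GiB")
print(f"per column:  {column_mib:.1f} MiB")
```

Under that assumption the results agree with the reported 80.0 B per record, 3.3 GiB in total, and the 337.3 MiB "Memory size" listed for each variable.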

Variable types

Categorical: 2
Numeric: 8

Warnings

PRODUCTID has a high cardinality: 1310 distinct values (High cardinality)
CURRENTPRICE is highly correlated with LISTPRICE (High correlation)
LISTPRICE is highly correlated with CURRENTPRICE (High correlation)
DATE is highly correlated with LOCATIONID (High correlation)
UNITSSOLD is highly correlated with LISTPRICE and 1 other field (High correlation)
LISTPRICE is highly correlated with UNITSSOLD and 1 other field (High correlation)
CURRENTPRICE is highly correlated with UNITSSOLD and 1 other field (High correlation)
LOCATIONID is highly correlated with DATE (High correlation)
AVAILABLEQUANTITY is highly skewed (γ1 = 219.0416833) (Skewed)
SALEPRICE is highly skewed (γ1 = 256.1790504) (Skewed)
UNITSRESTOCKED is highly skewed (γ1 = 694.084977) (Skewed)
AVAILABLEQUANTITY has 41347982 (93.5%) zeros (Zeros)
SALEPRICE has 44050686 (99.6%) zeros (Zeros)
UNITSRESTOCKED has 44097689 (99.7%) zeros (Zeros)
UNITSSOLD has 24122772 (54.6%) zeros (Zeros)
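
The skewness and zero warnings tend to appear together: a column that is almost entirely zero with a few large values has a long right tail. The sketch below illustrates this on made-up data using the plain moment-based skewness (third central moment divided by the cubed standard deviation). The function names and the toy column are hypothetical, and the report's own estimator may apply a small-sample correction, so exact values can differ.

```python
def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Toy column (not taken from the dataset): 99.7% zeros plus three restock events.
toy = [0] * 997 + [1, 2, 400]

print(f"zeros: {zero_share(toy):.1%}")
print(f"skewness: {skewness(toy):.1f}")
```

A single extreme value among mostly zeros is enough to push the skewness far above the values seen for the price columns, which is the pattern the warnings flag.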

Reproduction

Analysis started: 2021-06-20 20:57:05.703968
Analysis finished: 2021-06-21 01:19:02.337283
Duration: 4 hours, 21 minutes and 56.63 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
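
The reported duration follows from the two timestamps. A minimal check with the standard library, assuming both timestamps are in the same time zone (the report does not say):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2021-06-20 20:57:05.703968", FMT)
finished = datetime.strptime("2021-06-21 01:19:02.337283", FMT)

elapsed = finished - started
hours, remainder = divmod(elapsed.total_seconds(), 3600)
minutes, seconds = divmod(remainder, 60)

print(f"{int(hours)} hours, {int(minutes)} minutes and {seconds:.2f} seconds")
```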

Variables

DATE
Categorical

HIGH CORRELATION

Distinct: 37
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 337.3 MiB

Value  Count
2021-03-09  2813952
2021-03-13  2793472
2020-11-07  2302548
2021-02-23  2252800
2021-01-18  2244608
Other values (32)  31804845

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 442122250
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
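
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories over two dates taken from the Common Values table; the helper name `category_counts` is hypothetical, and this is not necessarily how the profiler computes its figures.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters of all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


sample = ["2020-11-07", "2021-03-09"]
counts = category_counts(sample)

# "Nd" is Decimal Number, "Pd" is Dash Punctuation.
for code, count in counts.most_common():
    print(code, count, f"{count / sum(counts.values()):.1%}")
```

Every ISO date has eight digits and two hyphens, which is why the report shows an 80% / 20% split between Decimal Number and Dash Punctuation.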

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020-11-07
2nd row: 2020-11-07
3rd row: 2020-11-07
4th row: 2020-11-07
5th row: 2020-11-07

Common Values

Value  Count  Frequency (%)
2021-03-09  2813952  6.4%
2021-03-13  2793472  6.3%
2020-11-07  2302548  5.2%
2021-02-23  2252800  5.1%
2021-01-18  2244608  5.1%
2021-03-31  2114796  4.8%
2021-02-28  2093056  4.7%
2021-02-07  2079976  4.7%
2020-12-05  1990891  4.5%
2020-12-08  1947341  4.4%
Other values (27)  21578785  48.8%

Length

Histogram of lengths of the category

Most occurring characters

Value  Count  Frequency (%)
2  117956760  26.7%
0  111941456  25.3%
-  88424450  20.0%
1  70259068  15.9%
3  23069134  5.2%
7  6765481  1.5%
8  6597175  1.5%
5  6421781  1.5%
6  4325294  1.0%
9  3696880  0.8%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  353697800  80.0%
Dash Punctuation  88424450  20.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
2  117956760  33.3%
0  111941456  31.6%
1  70259068  19.9%
3  23069134  6.5%
7  6765481  1.9%
8  6597175  1.9%
5  6421781  1.8%
6  4325294  1.2%
9  3696880  1.0%
4  2664771  0.8%
Dash Punctuation
Value  Count  Frequency (%)
-  88424450  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Common  442122250  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
2  117956760  26.7%
0  111941456  25.3%
-  88424450  20.0%
1  70259068  15.9%
3  23069134  5.2%
7  6765481  1.5%
8  6597175  1.5%
5  6421781  1.5%
6  4325294  1.0%
9  3696880  0.8%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  442122250  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
2  117956760  26.7%
0  111941456  25.3%
-  88424450  20.0%
1  70259068  15.9%
3  23069134  5.2%
7  6765481  1.5%
8  6597175  1.5%
5  6421781  1.5%
6  4325294  1.0%
9  3696880  0.8%

LOCATIONID
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 623
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21808.07437
Minimum: 110
Maximum: 69101
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 337.3 MiB

Quantile statistics

Minimum: 110
5-th percentile: 321
Q1: 10412
median: 11911
Q3: 45010
95-th percentile: 60128
Maximum: 69101
Range: 68991
Interquartile range (IQR): 34598

Descriptive statistics

Standard deviation: 19169.41152
Coefficient of variation (CV): 0.8790052344
Kurtosis: -0.5259138488
Mean: 21808.07437
Median Absolute Deviation (MAD): 1746
Skewness: 0.9912693172
Sum: 9.641834908 × 10^11
Variance: 367466338.1
Monotonicity: Not monotonic
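
Several of these statistics are derived from others, which gives a quick consistency check. The sketch below recomputes the range, IQR, coefficient of variation, variance and sum from the LOCATIONID figures in the report; the variable names are illustrative only, and small differences come from the rounding of the printed inputs.

```python
# Values copied from the LOCATIONID summary.
minimum, q1, q3, maximum = 110, 10412, 45010, 69101
mean, std = 21808.07437, 19169.41152
n_rows = 44_212_225

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance
total = mean * n_rows             # Sum

print(value_range, iqr)
print(f"CV = {cv:.7f}, variance = {variance:.1f}, sum = {total:.6e}")
```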
Histogram with fixed size bins (bins=50)

Common Values

Value  Count  Frequency (%)
10134  142494  0.3%
10130  140019  0.3%
10126  140019  0.3%
10127  140019  0.3%
10120  140019  0.3%
10136  140019  0.3%
10137  140019  0.3%
10162  138382  0.3%
10160  127290  0.3%
10171  127290  0.3%
Other values (613)  42836655  96.9%

Minimum 10 values

Value  Count  Frequency (%)
110  50916  0.1%
120  63645  0.1%
121  57296  0.1%
129  50916  0.1%
131  50916  0.1%
133  38187  0.1%
134  38187  0.1%
141  38187  0.1%
145  12729  < 0.1%
160  50916  0.1%

Maximum 10 values

Value  Count  Frequency (%)
69101  109627  0.2%
68101  101832  0.2%
66101  114561  0.3%
65201  101832  0.2%
65200  114561  0.3%
65190  101832  0.2%
65144  101832  0.2%
65143  101832  0.2%
65142  111242  0.3%
65140  89103  0.2%

SKUID
Real number (ℝ≥0)

Distinct: 12729
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 102744136.2
Minimum: 3371671
Maximum: 124400049
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 337.3 MiB

Quantile statistics

Minimum: 3371671
5-th percentile: 3795994
Q1: 107403584
median: 114806206
Q3: 117251763
95-th percentile: 122175059
Maximum: 124400049
Range: 121028378
Interquartile range (IQR): 9848179

Descriptive statistics

Standard deviation: 33836561.73
Coefficient of variation (CV): 0.3293283977
Kurtosis: 4.547442146
Mean: 102744136.2
Median Absolute Deviation (MAD): 2594805
Skewness: -2.521182011
Sum: 4.542546865 × 10^15
Variance: 1.14491291 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value  Count  Frequency (%)
113775042  3486  < 0.1%
114751105  3486  < 0.1%
106700050  3484  < 0.1%
107379376  3484  < 0.1%
122875593  3484  < 0.1%
117403197  3484  < 0.1%
3674979  3484  < 0.1%
114750169  3483  < 0.1%
104456921  3483  < 0.1%
117250330  3483  < 0.1%
Other values (12719)  44177384  99.9%

Minimum 10 values

Value  Count  Frequency (%)
3371671  3474  < 0.1%
3371672  3472  < 0.1%
3371673  3471  < 0.1%
3371677  3472  < 0.1%
3371678  3469  < 0.1%
3371679  3477  < 0.1%
3407894  3474  < 0.1%
3407895  3470  < 0.1%
3407897  3475  < 0.1%
3407898  3471  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
124400049  3473  < 0.1%
124400047  3470  < 0.1%
124400045  3474  < 0.1%
124400044  3471  < 0.1%
124400043  3473  < 0.1%
123950059  3476  < 0.1%
123950058  3472  < 0.1%
123950057  3471  < 0.1%
123950052  3481  < 0.1%
123950048  3474  < 0.1%

PRODUCTID
Categorical

HIGH CARDINALITY

Distinct: 1310
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 337.3 MiB

Value  Count
prod8610600  326237
prod8551592  281751
prod2020012  251865
prod8551591  248348
prod8351133  245167
Other values (1305)  42858857

Length

Max length: 12
Median length: 11
Mean length: 11.12147303
Min length: 10

Characters and Unicode

Total characters: 491705068
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: prod9280136
2nd row: prod3470051
3rd row: prod9390007
4th row: prod9890061
5th row: prod9280332

Common Values

Value  Count  Frequency (%)
prod8610600  326237  0.7%
prod8551592  281751  0.6%
prod2020012  251865  0.6%
prod8551591  248348  0.6%
prod8351133  245167  0.6%
prod8360162  237883  0.5%
prod8610597  236552  0.5%
prod2810229  209786  0.5%
prod9710083  208408  0.5%
prod8780491  198649  0.4%
Other values (1300)  41767579  94.5%

Length

Histogram of lengths of the category

Most occurring characters

Value  Count  Frequency (%)
0  90309716  18.4%
p  44212225  9.0%
r  44212225  9.0%
o  44212225  9.0%
d  44212225  9.0%
9  39086138  7.9%
1  32061467  6.5%
8  29367077  6.0%
2  27019427  5.5%
6  20568350  4.2%
Other values (4)  76443993  15.5%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  314856168  64.0%
Lowercase Letter  176848900  36.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0  90309716  28.7%
9  39086138  12.4%
1  32061467  10.2%
8  29367077  9.3%
2  27019427  8.6%
6  20568350  6.5%
3  20193665  6.4%
5  19867698  6.3%
7  18476341  5.9%
4  17906289  5.7%
Lowercase Letter
Value  Count  Frequency (%)
p  44212225  25.0%
r  44212225  25.0%
o  44212225  25.0%
d  44212225  25.0%

Most occurring scripts

Value  Count  Frequency (%)
Common  314856168  64.0%
Latin  176848900  36.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0  90309716  28.7%
9  39086138  12.4%
1  32061467  10.2%
8  29367077  9.3%
2  27019427  8.6%
6  20568350  6.5%
3  20193665  6.4%
5  19867698  6.3%
7  18476341  5.9%
4  17906289  5.7%
Latin
Value  Count  Frequency (%)
p  44212225  25.0%
r  44212225  25.0%
o  44212225  25.0%
d  44212225  25.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  491705068  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0  90309716  18.4%
p  44212225  9.0%
r  44212225  9.0%
o  44212225  9.0%
d  44212225  9.0%
9  39086138  7.9%
1  32061467  6.5%
8  29367077  6.0%
2  27019427  5.5%
6  20568350  4.2%
Other values (4)  76443993  15.5%

AVAILABLEQUANTITY
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 275
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2131558409
Minimum: 0
Maximum: 1822
Zeros: 41347982
Zeros (%): 93.5%
Negative: 0
Negative (%): 0.0%
Memory size: 337.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 1822
Range: 1822
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.639859138
Coefficient of variation (CV): 7.693240452
Kurtosis: 192153.5216
Mean: 0.2131558409
Median Absolute Deviation (MAD): 0
Skewness: 219.0416833
Sum: 9424094
Variance: 2.689137993
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value  Count  Frequency (%)
0  41347982  93.5%
1  908743  2.1%
2  668073  1.5%
3  457784  1.0%
4  305099  0.7%
5  176737  0.4%
6  103881  0.2%
7  63167  0.1%
8  42679  0.1%
9  28691  0.1%
Other values (265)  109389  0.2%

Minimum 10 values

Value  Count  Frequency (%)
0  41347982  93.5%
1  908743  2.1%
2  668073  1.5%
3  457784  1.0%
4  305099  0.7%
5  176737  0.4%
6  103881  0.2%
7  63167  0.1%
8  42679  0.1%
9  28691  0.1%

Maximum 10 values

Value  Count  Frequency (%)
1822  1  < 0.1%
1687  1  < 0.1%
1684  2  < 0.1%
1682  1  < 0.1%
1675  2  < 0.1%
898  1  < 0.1%
667  1  < 0.1%
631  1  < 0.1%
549  1  < 0.1%
473  2  < 0.1%

SALEPRICE
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 38
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.004348706721
Minimum: 0
Maximum: 182
Zeros: 44050686
Zeros (%): 99.6%
Negative: 0
Negative (%): 0.0%
Memory size: 337.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 182
Range: 182
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.09123389259
Coefficient of variation (CV): 20.97954598
Kurtosis: 389108.1135
Mean: 0.004348706721
Median Absolute Deviation (MAD): 0
Skewness: 256.1790504
Sum: 192266
Variance: 0.008323623158
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=38)

Common Values

Value  Count  Frequency (%)
0  44050686  99.6%
1  142307  0.3%
2  13999  < 0.1%
3  2975  < 0.1%
4  1088  < 0.1%
5  504  < 0.1%
6  250  < 0.1%
7  131  < 0.1%
8  88  < 0.1%
9  51  < 0.1%
Other values (28)  146  < 0.1%

Minimum 10 values

Value  Count  Frequency (%)
0  44050686  99.6%
1  142307  0.3%
2  13999  < 0.1%
3  2975  < 0.1%
4  1088  < 0.1%
5  504  < 0.1%
6  250  < 0.1%
7  131  < 0.1%
8  88  < 0.1%
9  51  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
182  1  < 0.1%
72  1  < 0.1%
66  1  < 0.1%
64  1  < 0.1%
44  2  < 0.1%
40  2  < 0.1%
38  1  < 0.1%
37  1  < 0.1%
36  1  < 0.1%
33  1  < 0.1%

UNITSRESTOCKED
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 88
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.004878424463
Minimum: 0
Maximum: 406
Zeros: 44097689
Zeros (%): 99.7%
Negative: 0
Negative (%): 0.0%
Memory size: 337.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 406
Range: 406
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.2057122327
Coefficient of variation (CV): 42.16776016
Kurtosis: 1100510.526
Mean: 0.004878424463
Median Absolute Deviation (MAD): 0
Skewness: 694.084977
Sum: 215686
Variance: 0.04231752269
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value  Count  Frequency (%)
0  44097689  99.7%
1  77220  0.2%
2  18520  < 0.1%
3  7951  < 0.1%
4  4152  < 0.1%
5  1971  < 0.1%
6  1317  < 0.1%
7  801  < 0.1%
8  588  < 0.1%
9  394  < 0.1%
Other values (78)  1622  < 0.1%

Minimum 10 values

Value  Count  Frequency (%)
0  44097689  99.7%
1  77220  0.2%
2  18520  < 0.1%
3  7951  < 0.1%
4  4152  < 0.1%
5  1971  < 0.1%
6  1317  < 0.1%
7  801  < 0.1%
8  588  < 0.1%
9  394  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
406  1  < 0.1%
394  1  < 0.1%
392  1  < 0.1%
263  1  < 0.1%
198  1  < 0.1%
196  1  < 0.1%
194  1  < 0.1%
172  1  < 0.1%
133  1  < 0.1%
128  1  < 0.1%

CURRENTPRICE
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 67
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 73.18668237
Minimum: 7
Maximum: 299
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 337.3 MiB

Quantile statistics

Minimum: 7
5-th percentile: 19
Q1: 49
median: 68
Q3: 98
95-th percentile: 128
Maximum: 299
Range: 292
Interquartile range (IQR): 49

Descriptive statistics

Standard deviation: 35.73652885
Coefficient of variation (CV): 0.4882927836
Kurtosis: 2.177769492
Mean: 73.18668237
Median Absolute Deviation (MAD): 20
Skewness: 0.9355204537
Sum: 3235746068
Variance: 1277.099495
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value  Count  Frequency (%)
128  3772361  8.5%
58  3589811  8.1%
68  3137423  7.1%
98  2844978  6.4%
49  2769499  6.3%
69  2725760  6.2%
59  2725515  6.2%
39  2500704  5.7%
88  2030531  4.6%
118  1967877  4.5%
Other values (57)  16147766  36.5%

Minimum 10 values

Value  Count  Frequency (%)
7  4797  < 0.1%
8  166713  0.4%
9  496643  1.1%
10  3475  < 0.1%
11  3474  < 0.1%
12  138918  0.3%
14  332198  0.8%
15  6945  < 0.1%
16  24480  0.1%
18  283718  0.6%

Maximum 10 values

Value  Count  Frequency (%)
299  3480  < 0.1%
298  7280  < 0.1%
248  149135  0.3%
244  911  < 0.1%
238  3473  < 0.1%
228  83852  0.2%
199  1821  < 0.1%
198  115488  0.3%
179  11148  < 0.1%
178  29252  0.1%

LISTPRICE
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 49
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 90.08619421
Minimum: 8
Maximum: 598
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 337.3 MiB

Quantile statistics

Minimum: 8
5-th percentile: 38
Q1: 58
median: 88
Q3: 118
95-th percentile: 148
Maximum: 598
Range: 590
Interquartile range (IQR): 60

Descriptive statistics

Standard deviation: 38.92280054
Coefficient of variation (CV): 0.4320617702
Kurtosis: 4.577780375
Mean: 90.08619421
Median Absolute Deviation (MAD): 30
Skewness: 0.9698655916
Sum: 3982911088
Variance: 1514.984402
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=49)

Common Values

Value  Count  Frequency (%)
128  6546467  14.8%
58  5514415  12.5%
98  5464857  12.4%
68  5042626  11.4%
88  4021604  9.1%
118  3561788  8.1%
108  2187571  4.9%
48  2008986  4.5%
78  1967571  4.5%
148  1141788  2.6%
Other values (39)  6754552  15.3%

Minimum 10 values

Value  Count  Frequency (%)
8  166713  0.4%
10  3475  < 0.1%
11  3474  < 0.1%
12  138918  0.3%
14  298771  0.7%
15  6945  < 0.1%
16  38209  0.1%
18  607840  1.4%
20  86836  0.2%
22  72923  0.2%

Maximum 10 values

Value  Count  Frequency (%)
598  3480  < 0.1%
398  3471  < 0.1%
348  3472  < 0.1%
298  52095  0.1%
268  3473  < 0.1%
248  218842  0.5%
238  3473  < 0.1%
228  163280  0.4%
198  461916  1.0%
178  152795  0.3%

UNITSSOLD
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 34
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.56164179
Minimum: 0
Maximum: 299
Zeros: 24122772
Zeros (%): 54.6%
Negative: 0
Negative (%): 0.0%
Memory size: 337.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 49
95-th percentile: 89
Maximum: 299
Range: 299
Interquartile range (IQR): 49

Descriptive statistics

Standard deviation: 32.66404021
Coefficient of variation (CV): 1.277853765
Kurtosis: 0.4258587312
Mean: 25.56164179
Median Absolute Deviation (MAD): 0
Skewness: 1.01469604
Sum: 1130137058
Variance: 1066.939523
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=34)

Common Values

Value  Count  Frequency (%)
0  24122772  54.6%
49  2769499  6.3%
69  2725760  6.2%
59  2725515  6.2%
39  2500704  5.7%
79  1788788  4.0%
29  1600686  3.6%
89  1366068  3.1%
19  1060321  2.4%
99  762788  1.7%
Other values (24)  2789324  6.3%

Minimum 10 values

Value  Count  Frequency (%)
0  24122772  54.6%
7  4797  < 0.1%
9  496643  1.1%
14  159628  0.4%
19  1060321  2.4%
24  161279  0.4%
29  1600686  3.6%
34  422595  1.0%
39  2500704  5.7%
44  374443  0.8%

Maximum 10 values

Value  Count  Frequency (%)
299  3480  < 0.1%
244  911  < 0.1%
199  1821  < 0.1%
179  11148  < 0.1%
169  27586  0.1%
159  21906  < 0.1%
149  12062  < 0.1%
139  121512  0.3%
134  45112  0.1%
129  3496  < 0.1%

Interactions

[Figures: 64 interaction plots, rendered with Matplotlib v3.4.2]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
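
The calculation just described can be sketched in a few lines of plain Python. The function name `pearson_r` and the toy price lists are illustrative assumptions, not data from the report, and the profiler itself relies on library routines rather than this code.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)


# Toy data (hypothetical): list prices and discounted current prices.
list_price = [58, 68, 88, 98, 128]
current_price = [49, 68, 88, 79, 128]

r = pearson_r(list_price, current_price)
# Shifting and rescaling one variable leaves r unchanged.
r_rescaled = pearson_r(list_price, [3 * p + 7 for p in current_price])
print(round(r, 4), round(r_rescaled, 4))
```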

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
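
A minimal sketch of that recipe: replace each value by its rank (ties share the average rank, a common convention assumed here), then apply the Pearson formula to the ranks. All function names are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)


def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# A cubic relationship is monotonic but not linear.
xs = [1, 2, 3, 4]
ys = [x ** 3 for x in xs]
print(round(spearman_rho(xs, ys), 4), round(pearson_r(xs, ys), 4))
```

On the cubic example the rank correlation is perfect while Pearson's r falls short of 1, which is the difference the paragraph above describes.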

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the difference between the number of concordant pairs and the number of discordant pairs, divided by the total number of pairs.
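
The pair-counting definition can be written out directly. The sketch below implements that plain form (often called tau-a); library implementations commonly apply a tie correction (tau-b), so results can differ when the data contain ties. The function name is illustrative, and the quadratic loop is only practical for small samples.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    return (concordant - discordant) / len(pairs)


# One swapped pair out of six: 5 concordant, 1 discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(round(tau, 4))
```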

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the phik project.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  DATE  LOCATIONID  SKUID  PRODUCTID  AVAILABLEQUANTITY  SALEPRICE  UNITSRESTOCKED  CURRENTPRICE  LISTPRICE  UNITSSOLD
0  2020-11-07  268  113800939  prod9280136  0  0.0  0  99  198  99.0
1  2020-11-07  268  3777889  prod3470051  0  0.0  0  128  128  0.0
2  2020-11-07  268  107376607  prod9390007  0  0.0  0  128  128  0.0
3  2020-11-07  268  117377502  prod9890061  0  0.0  0  99  128  99.0
4  2020-11-07  268  113775041  prod9280332  0  0.0  0  19  28  19.0
5  2020-11-07  268  101777165  prod8430902  0  0.0  0  98  98  0.0
6  2020-11-07  268  115650177  prod9370109  0  0.0  0  98  98  0.0
7  2020-11-07  268  122825362  prod9710083  0  0.0  0  58  58  0.0
8  2020-11-07  268  122851265  prod10030207  0  0.0  0  68  68  0.0
9  2020-11-07  268  117400738  prod8780491  0  0.0  0  88  88  0.0

Last rows

Index  DATE  LOCATIONID  SKUID  PRODUCTID  AVAILABLEQUANTITY  SALEPRICE  UNITSRESTOCKED  CURRENTPRICE  LISTPRICE  UNITSSOLD
44212215  2020-11-05  212  117376453  prod8555454  0  0.0  0  58  58  0.0
44212216  2020-11-05  212  117401608  prod10120018  0  0.0  0  64  88  64.0
44212217  2020-11-05  212  3764063  prod8300206  0  0.0  0  128  128  0.0
44212218  2020-11-05  212  118925434  prod9960764  0  0.0  0  168  168  0.0
44212219  2020-11-05  212  3821704  prod8351133  0  0.0  0  118  118  0.0
44212220  2020-11-05  212  104457186  prod8470023  0  0.0  0  118  118  0.0
44212221  2020-11-05  212  3705917  prod8351063  0  0.0  0  44  44  0.0
44212222  2020-11-05  212  112900039  prod9200027  0  0.0  0  98  98  0.0
44212223  2020-11-05  212  117226881  prod10000024  0  0.0  0  49  88  49.0
44212224  2020-11-05  212  107383031  prod2090108  0  0.0  0  68  68  0.0